@akoumpa
Contributor

@akoumpa akoumpa commented Oct 24, 2025

Changes:

  • Surfaces the padding and truncation options in the formatting utils.
  • For the ColumnMapped dataset, includes the above options and also adds a fallback for the case where all labels are empty.
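
To make the two options concrete, here is a minimal sketch of what padding and truncation do to a single token sequence. The helper `pad_or_truncate` and its parameters (`max_length`, `pad_id`) are hypothetical names chosen for illustration; they are not the formatting utils touched by this PR.

```python
from typing import List


def pad_or_truncate(
    token_ids: List[int],
    max_length: int,
    pad_id: int = 0,          # assumption: 0 is the pad token id
    padding: bool = False,
    truncation: bool = False,
) -> List[int]:
    """Illustrative helper (hypothetical, not the PR's code)."""
    out = list(token_ids)
    # Truncation: drop tokens past max_length, only if requested.
    if truncation and len(out) > max_length:
        out = out[:max_length]
    # Padding: append pad tokens up to max_length, only if requested.
    if padding and len(out) < max_length:
        out = out + [pad_id] * (max_length - len(out))
    return out


long_seq = pad_or_truncate([1, 2, 3, 4, 5], max_length=3, truncation=True)
short_seq = pad_or_truncate([1, 2], max_length=4, padding=True)
untouched = pad_or_truncate([1, 2, 3, 4, 5], max_length=3)
```

With both flags left at `False` the sequence passes through unchanged, which is why surfacing the options to the caller matters.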

@copy-pr-bot

copy-pr-bot bot commented Oct 24, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@akoumpa akoumpa linked an issue Oct 24, 2025 that may be closed by this pull request
@akoumpa akoumpa force-pushed the akoumparouli/feat_surface_padding_truncation_options branch from a11d569 to 420a23c Compare October 27, 2025 05:44
@akoumpa
Contributor Author

akoumpa commented Oct 27, 2025

/ok to test 7d99857

@akoumpa
Contributor Author

akoumpa commented Oct 27, 2025

/ok to test 5e30ab6

@akoumpa
Contributor Author

akoumpa commented Oct 27, 2025

/ok to test 2e62dc3

Signed-off-by: Alexandros Koumparoulis <[email protected]>
@akoumpa
Contributor Author

akoumpa commented Oct 29, 2025

/ok to test 13e625b

@akoumpa akoumpa marked this pull request as ready for review October 29, 2025 05:20
return_dict=True,
return_assistant_tokens_mask=template_has_generation_kwd,
padding=padding,
truncation=truncation,
Contributor

If I understand correctly, we truncate regardless of the template structure here.

Contributor Author

yes
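
A minimal sketch of the consequence discussed above, assuming a per-token assistant mask (1 for assistant tokens, 0 otherwise) and the conventional ignore index of -100 for unsupervised positions. The helpers `truncate_with_mask` and `has_supervised_tokens` are hypothetical and only illustrate the behavior. Because truncation cuts by position and not by turn, a short `max_length` can remove the whole assistant turn and leave no label to train on. This may be the case the empty-labels fallback is meant to cover, though the thread does not say so.

```python
from typing import List, Tuple

IGNORE_INDEX = -100  # assumption: conventional "ignore" label value


def truncate_with_mask(
    input_ids: List[int], assistant_mask: List[int], max_length: int
) -> Tuple[List[int], List[int]]:
    """Cut by position only, ignoring where chat turns begin or end."""
    ids = input_ids[:max_length]
    mask = assistant_mask[:max_length]
    # Keep a label only where the token belongs to the assistant.
    labels = [tok if m == 1 else IGNORE_INDEX for tok, m in zip(ids, mask)]
    return ids, labels


def has_supervised_tokens(labels: List[int]) -> bool:
    """True if at least one position contributes to the loss."""
    return any(label != IGNORE_INDEX for label in labels)


# Four prompt tokens followed by two assistant tokens.
input_ids = [10, 11, 12, 13, 20, 21]
assistant_mask = [0, 0, 0, 0, 1, 1]

ids_4, labels_4 = truncate_with_mask(input_ids, assistant_mask, max_length=4)
ids_5, labels_5 = truncate_with_mask(input_ids, assistant_mask, max_length=5)
```

Here `labels_4` contains only ignored positions, while `labels_5` keeps one assistant token, so a caller would need to detect and handle the first case.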

Contributor

@HuiyingLi HuiyingLi left a comment

LGTM thank you!

@HuiyingLi HuiyingLi merged commit 38b330c into main Oct 29, 2025
51 checks passed
@HuiyingLi HuiyingLi deleted the akoumparouli/feat_surface_padding_truncation_options branch October 29, 2025 07:16
Development

Successfully merging this pull request may close these issues.

Add sequence truncation support for long sequences

3 participants